Divide and conquer:
BINARY DIVISION 

You can carry out binary division, one of the most difficult operations
for a computer to perform, in simple µPs using low-level instructions.



Performing long division in any number system can be 
painful. For example, consider how you would divide 14 into 
663 using the decimal system. Assuming that you don't have 
the calculations from Figure 1 in front of you, you would 
probably think, you can't divide 14 into 6 so you take the 
next option and divide 14 into 66. But how would you set 
about doing this division? Most people would probably hunt 
round in their minds and on their fingers, thinking some- 
thing like: 3 x 14=42, but that's too small; 5 x 14=70, but that's 
too big; 4x14=56, and that's as close as you can get. 

So now you know that the first digit of your result is 4 
(from the 4x14). Next, you subtract 56 from 66, leaving 10; 
drop the 3 down to form 103; and go through the process 
again: 6x14=84, but that's too small; 8x14=112, but that's 
too big; 7x14=98, and that's as close as you can get. Thus, 
you now know that the second digit of your result is 7. Final- 
ly, you subtract 98 from 103, leaving 5, which is too small to 
divide by 14, so you know that your result is 47 with a 
remainder of 5. (Note that you're performing an integer divi- 
sion.) 
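
The hunt-and-check procedure above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `long_divide` is hypothetical, and it assumes a non-negative dividend and a positive divisor.

```python
def long_divide(dividend: int, divisor: int) -> tuple[int, int]:
    """Schoolbook decimal long division; returns (quotient, remainder)."""
    if dividend < 0 or divisor <= 0:
        raise ValueError("expected dividend >= 0 and divisor > 0")

    quotient = 0
    remainder = 0
    for digit in str(dividend):
        # "Drop the next digit down" next to whatever is left over.
        remainder = remainder * 10 + int(digit)

        # Hunt for the largest multiple of the divisor that still fits.
        q_digit = 0
        while (q_digit + 1) * divisor <= remainder:
            q_digit += 1

        remainder -= q_digit * divisor
        quotient = quotient * 10 + q_digit

    return quotient, remainder


print(long_divide(663, 14))  # (47, 5)
```

The inner `while` loop plays the part of counting on your fingers: it tries 1 x 14, 2 x 14, and so on, and stops just before the product becomes too big.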

A CPU goes through a similar 
process; it has to perform a series of iter- 
ations by taking stabs in the dark, 
checking to see if it went too far, and 
backtracking if it did. To further con- 
fuse the issue, you end up doing every- 
thing somewhat backward, using the 
same sort of tricks that you employ to 
make binary multiplications easier for 
the CPU to handle (Reference 2). 

          47
      14)663
         56         4x14=56
         10         66-56=10
         103        Drop the 3 to form 103
          98        7x14=98
           5        103-98=5 (result equals 47 plus 5 remainder)

Figure 1: Performing long division in decimal requires a little effort.

Assume that you wish to divide a 16-bit dividend by a 16-bit divisor
(Figure 2). This process is difficult to follow, but everything will
come out in the wash. First, you reserve a 2-byte field to store your
divisor and a 4-byte field to store your result. Also, you initialize
the two most significant bytes of the result to contain all zeros, and
you load the two least significant bytes with your dividend (the number
to be divided). Once you've initialized everything, you perform the
following sequence of operations 16 times:



This article is an excerpt from the 
author's forthcoming book, Bebop Bytes 
Back (Reference 1). 
Figure 2: Perform unsigned 16-bit divisions using a shift and
subtract/add technique.

1. Shift the 32-bit result 1 bit to the left. (Shift a logic 0
into the least significant bit.)

2. Subtract the 2-byte divisor from the most significant 2
bytes of the result and store the answer back into these
2 bytes.

3. A carry flag with a value of 0 following Step 2 indicates
a positive result, which means that the divisor is no bigger
than the two most significant bytes of the result. In
this case, force the least significant bit of the result to 1.
A carry flag with a value of 1 indicates a negative result,
which means that the divisor is bigger than the two
most significant bytes of the result. In this case, leave
the least significant bit of the result as a 0 (from the
shift), add the 2-byte divisor back to the most significant
2 bytes of the result, and store the answer back into
these 2 bytes.

Whenever the carry flag contains a 1 entering Step 3, the 
divisor is too big to subtract from the portion of the dividend 
that you're currently examining. But you discover this infor- 
mation too late, because you've already performed the sub- 
traction, and you now must add the divisor back in. This 
process is known as a restoring-division algorithm for just 
this reason; you have to keep on restoring the divisor every 
time you "go too far." There are nonrestoring-division algo- 
rithms, which are somewhat more efficient, but also a tad 
more complicated — so just ignore them and hope they'll go 
away. 
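
The three steps can be modeled in Python to check them against ordinary integer division. This is a sketch rather than a translation of any µP code: the function name `restoring_divide_u16` and the `carry` variable are illustrative, and the carry flag follows the convention used in the steps above (1 means the subtraction went negative).

```python
MASK16 = 0xFFFF
MASK32 = 0xFFFFFFFF


def restoring_divide_u16(dividend: int, divisor: int) -> tuple[int, int]:
    """Return (quotient, remainder) using the shift and subtract/add steps."""
    if not 0 <= dividend <= MASK16 or not 0 < divisor <= MASK16:
        raise ValueError("expected 0 <= dividend <= 65535 and 1 <= divisor <= 65535")

    # 4-byte result: upper two bytes start at zero, lower two hold the dividend.
    result = dividend

    for _ in range(16):
        # Step 1: shift the 32-bit result left, bringing a 0 into the LSB.
        result = (result << 1) & MASK32
        upper, lower = result >> 16, result & MASK16

        # Step 2: subtract the divisor from the upper two bytes and store it.
        carry = 1 if upper < divisor else 0   # 1 means the result went negative
        upper = (upper - divisor) & MASK16    # wraps around like a 16-bit register

        # Step 3: keep the subtraction, or restore what was there before.
        if carry == 0:
            lower |= 1
        else:
            upper = (upper + divisor) & MASK16

        result = (upper << 16) | lower

    # Quotient ends up in the lower two bytes, remainder in the upper two.
    return result & MASK16, result >> 16


print(restoring_divide_u16(663, 14))  # (47, 5)
```

Note how the "too late" discovery shows up in the code: the subtraction in Step 2 is always stored, and only afterwards does Step 3 decide whether to add the divisor back.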

Now, unless you have a size 16 brain, these exercises have probably
left you feeling overheated, confused, lost, and alone. Don't be
afraid; computer divisions can sometimes bring you to your knees. The
easiest way to understand the process is to examine a much simpler test
case based on 4-bit numbers. For example, consider how you'd divide
1011₂ (11 in decimal) by 0011₂ (3 in decimal). The first step is to
set up some initial conditions (Figure 3).

Figure 3: Initial conditions provide the starting point for your
division test case.

Figure 4: The first cycle of your 4-bit-division test case consists of
initial conditions (a), a shift (b), a subtraction (c), and an
addition (d).

Figure 5: The second cycle of your 4-bit-division test case also
consists of initial conditions (a), a shift (b), a subtraction (c), and
an addition (d).

Because this experiment is based on 4-bit numbers, the four most
significant bits of what will eventually be your 8-bit result are set
to 0, and your dividend is loaded into the four least significant bits
of the result. Now, consider what happens during the first cycle of the
process (Figure 4). Commencing with your initial conditions (Figure
4a), you shift the entire 8-bit result 1 bit to the left and also shift
a logic 0 into the least significant bit (Figure 4b). Next, subtract
the divisor from the four most significant bits of the result (Figure
4c). However, this step sets the carry flag to a logic 1, which tells
you that you've gone too far. Thus, you complete this cycle by adding
the divisor back into the four most significant bits of the result
(Figure 4d). As fate would have it, the second cycle offers another
helping of the same sequence (Figure 5).

Once again, the first thing you do is shift the entire 8-bit result 1
bit to the left and also shift a logic 0 into the least significant bit
(Figure 5b). Next, subtract the divisor from the four most significant
bits of the result (Figure 5c). This action again sets the carry flag
to a logic 1, which tells you that you've gone too far. Thus, you
complete this cycle by adding the divisor back into the four most
significant bits of the result (Figure 5d). You can only hope that the
third cycle will do something to break the monotony (Figure 6).

As usual, begin by shifting your 8-bit result 1 bit to the left (Figure
6b) and subtracting the divisor from the four most significant bits of
the result (Figure 6c). In this cycle, however, the carry flag contains
a logic 0, so the only thing you have to do is force the least
significant bit of the result to a logic 1 (Figure 6d). Now, the
excitement really starts to mount, because you've only got one more
cycle to go, and you don't appear to be closing in on a result.

In Figure 7, you start by shifting the 
8-bit result 1 bit to the left (Figure 7b) 
and subtracting the divisor from the 
four most significant bits of the result 
(Figure 7c). Again, the carry flag con- 
tains a logic 0; thus, again, you must 
force the least significant bit of the 
result to a logic 1 (Figure 7d). 

When you divide 11 by 3, you expect a quotient (result) of 3 and a
remainder of 2. The four most significant bits (which represent the
remainder) contain 0010₂ (2 in decimal), and the four least significant
bits (which represent the quotient) contain 0011₂ (3 in decimal). Good
grief, it works!
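
To replay the four cycles without the figures in front of you, a small Python trace can print the 8-bit result at the end of each cycle. This is a sketch under stated assumptions: the function name `trace_restoring_division` and its `bits` parameter are hypothetical, and the carry value follows the convention in the text (1 means the subtraction went too far).

```python
def trace_restoring_division(dividend: int, divisor: int, bits: int = 4):
    """Return one (cycle, carry, state) tuple per cycle of the division."""
    mask = (1 << bits) - 1
    result = dividend & mask          # upper half zero, lower half = dividend
    rows = []

    for cycle in range(1, bits + 1):
        # (b) Shift the double-width result left; a 0 enters the LSB.
        result = (result << 1) & ((1 << (2 * bits)) - 1)
        upper, lower = result >> bits, result & mask

        # (c) Subtract the divisor from the upper half.
        carry = 1 if upper < divisor else 0
        difference = (upper - divisor) & mask

        # (d) Either add the divisor back or set the LSB of the result.
        if carry:
            upper = (difference + divisor) & mask
        else:
            upper = difference
            lower |= 1

        result = (upper << bits) | lower
        rows.append((cycle, carry, format(result, f"0{2 * bits}b")))

    return rows


for cycle, carry, state in trace_restoring_division(0b1011, 0b0011):
    print(cycle, carry, state[:4], state[4:])
# 1 1 0001 0110
# 2 1 0010 1100
# 3 0 0010 1001
# 4 0 0010 0011
```

The last row holds the remainder (0010) in the upper four bits and the quotient (0011) in the lower four, which matches the values quoted above.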

Unfortunately, your division algorithm will not always perform
correctly if it encounters negative values, so you have to perform
tricks similar to those you used for your signed multiplication
subroutines (Reference 2). Thus, you need to check the signs of the
numbers first, change any negative values into their positive
counterparts using 2's complement techniques, perform the division, and
correct the sign of the result if necessary.
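
The sign handling can be sketched in Python on raw 16-bit patterns. All names here (`to_bits`, `from_bits`, `negate16`, `signed_divide_16`) are hypothetical helpers, not part of any listing, and Python's `//` on the two magnitudes merely stands in for the unsigned shift and subtract/add loop. The sketch assumes inputs in the range -32,767 to +32,767 and a nonzero divisor.

```python
MASK16 = 0xFFFF
SIGN16 = 0x8000


def to_bits(value: int) -> int:
    """Encode a Python int as a 16-bit 2's complement pattern."""
    return value & MASK16


def from_bits(bits: int) -> int:
    """Decode a 16-bit 2's complement pattern back to a Python int."""
    return bits - 0x10000 if bits & SIGN16 else bits


def negate16(bits: int) -> int:
    """2's complement negation: invert every bit, then add 1."""
    return (~bits + 1) & MASK16


def signed_divide_16(dividend_bits: int, divisor_bits: int) -> int:
    # The quotient is negative only when exactly one operand is negative,
    # which is the XOR of the two sign bits.
    flag = (dividend_bits ^ divisor_bits) & SIGN16

    # Change any negative value into its positive counterpart.
    if dividend_bits & SIGN16:
        dividend_bits = negate16(dividend_bits)
    if divisor_bits & SIGN16:
        divisor_bits = negate16(divisor_bits)

    # Stand-in for the unsigned division loop.
    quotient = dividend_bits // divisor_bits

    # Correct the sign of the result if necessary.
    return negate16(quotient) if flag else quotient


print(from_bits(signed_divide_16(to_bits(-663), to_bits(14))))  # -47
```

Because the division itself only ever sees magnitudes, the quotient is truncated toward zero: -663 divided by 14 gives -47 rather than -48.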

Now assume that you're working with an 8-bit µP and you
wish to create a subroutine to perform signed binary division
on 16-bit numbers. Listing 1 (pg 166) describes just such a
subroutine, which retrieves two 16-bit signed numbers from 
the stack, divides one into the other, and places the 16-bit 
signed result on the top of the stack. (This subroutine 
assumes that any 16-bit numbers are stored with the most 
significant byte "on top" of the least significant byte.) 

Note that the 2-byte remainder from the division ends up 
in the two most significant bytes of the result (_AH_RES and 
_AH_RES+1). Thus, if you decide that you want your sub- 
routine to return this remainder, you just add two more pairs 
of LDA and PUSHA instructions in the _AH_SAVE section of 
the subroutine (just before you push the return address onto 
the stack). However, usually you create a separate subroutine 
that returns just the remainder. Alternatively, you could 
modify this subroutine to have multiple entry and exit 
points, depending on whether you wish the subroutine to 
return the remainder or the quotient. Both techniques save
you from passing the remainder back and forth when you
don't wish to use it.

The assembly language in Listing 1 corresponds to no particular
µP; it is designed for the virtual µP implemented in
software that accompanies Reference 1.




Figure 6: The third cycle of your 4-bit-division test case consists of
initial conditions (a), a shift (b), a subtraction (c), and setting the
least significant bit of the result to a logic 1 (d).

Figure 7: The final cycle of your 4-bit-division test case consists of
initial conditions (a), a shift (b), a subtraction (c), and setting the
least significant bit of the result to a logic 1 (d).
Listing 1: 16-bit signed-binary-division subroutine

##########################################################
# Copyright(c) Maxfield & Montrose Interactive Inc., 1996, 1997.
#
# The authors are not responsible for the consequences of
# using this software, no matter how awful, even if they
# arise from defects in it.
#
# Name:     _SDIV16
#
# Function: Divides two 16-bit signed numbers (in the range
#           -32,767 to +32,767): returns a 16-bit signed result
#
# Entry:    Top of stack
#           Most-significant byte of return address
#           Least-significant byte of return address
#           MS Byte of 1st 16-bit number (Divisor)
#           LS Byte of 1st 16-bit number (Divisor)
#           MS Byte of 2nd 16-bit number (Dividend)
#           LS Byte of 2nd 16-bit number (Dividend)
#
# Exit:     Top of stack
#           Most-significant byte of result
#           Least-significant byte of result
#
# Modifies: Accumulator
#           Index register
#
# Size:     Program = 226 bytes
#           Data    = 9 bytes
##########################################################

_SDIV16:  BLDX 16             # Load the index register with 16,
                              # which equals the number of times
                              # we want to go around the loop
          POPA                # Retrieve MS byte of return
          STA  [_AH_RADD]     # address from stack and store it
          POPA                # Retrieve LS byte of return
          STA  [_AH_RADD+1]   # address from stack and store it
          POPA                # Retrieve MS byte of the divisor
          STA  [_AH_DIV]      # from the stack and store it
          POPA                # Retrieve LS byte of the divisor
          STA  [_AH_DIV+1]    # from the stack and store it

# Note that the result is 4 bytes in size (_AH_RES+0, +1, +2,
# and +3), where _AH_RES+0 is the most-significant byte
          POPA                # Retrieve MS dividend from stack
          STA  [_AH_RES+2]    # and store it in byte 2 of result
          POPA                # Retrieve LS dividend from stack
          STA  [_AH_RES+3]    # and store it in byte 3 of result
          LDA  0              # Load the accumulator with 0 and
          STA  [_AH_RES]      # store it in byte 0 of result then
          STA  [_AH_RES+1]    # in byte 1 of result

####
#### Check that we're not trying to divide by zero. If we are then
#### it's an ERROR, so just return zero and bomb out
####
_AH_TSTZ: LDA  [_AH_DIV]      # Load MS byte of the divisor and OR
          OR   [_AH_DIV+1]    # it with the LS byte of the divisor
          JNZ  [_AH_TSTA]     # If the result isn't zero then we've
                              # got at least one logic 1, so jump
                              # to the bit to test for -ve number.
          PUSHA               # Otherwise push the zero in ACC onto
          PUSHA               # the stack twice, then jump to the
          JMP  [_AH_RET]      # last chunk of the return routine

####
#### Invert input values if necessary and load the output flag
####
_AH_TSTA: LDA  [_AH_DIV]      # Load ACC with MS byte of divisor
          STA  [_AH_FLAG]     # and save it to the flag
          JNN  [_AH_TSTB]     # If the divisor is positive then
                              # jump to '_AH_TSTB', otherwise ..
          LDA  0              # ..load the accumulator with 0
          SUB  [_AH_DIV+1]    # ..subtract LS byte of divisor
          STA  [_AH_DIV+1]    # ..(no carry in) store result
          LDA  0              # ..load the accumulator with 0
          SUBC [_AH_DIV]      # ..subtract MS byte of divisor
          STA  [_AH_DIV]      # ..(yes carry in) store result

_AH_TSTB: LDA  [_AH_FLAG]     # Load the flag,
          XOR  [_AH_RES+2]    # XOR it with MS byte of dividend,
          STA  [_AH_FLAG]     # then store the flag again

          LDA  [_AH_RES+2]    # Load MS dividend into the ACC
          JNN  [_AH_LOOP]     # If dividend is positive then
                              # jump to '_AH_LOOP', otherwise ..
          LDA  0              # ..load the accumulator with 0
          SUB  [_AH_RES+3]    # ..subtract LS byte of dividend
          STA  [_AH_RES+3]    # ..(no carry in) store result
          LDA  0              # ..load the accumulator with 0
          SUBC [_AH_RES+2]    # ..subtract MS byte of dividend
          STA  [_AH_RES+2]    # ..(yes carry in) store result

####
#### Hold tight - this is the start of the main division loop
####
_AH_LOOP: LDA  [_AH_RES+3]    # Load ACC with LS byte of dividend
          SHL                 # and shift left 1 bit
          STA  [_AH_RES+3]    # Store it
          LDA  [_AH_RES+2]    # Load ACC with MS byte of dividend
          ROLC                # and rotate left 1 bit
          STA  [_AH_RES+2]    # Store it
          LDA  [_AH_RES+1]    # Load ACC with LS byte of remainder
          ROLC                # and rotate left 1 bit
          STA  [_AH_RES+1]    # Store it
          LDA  [_AH_RES]      # Load ACC with MS byte of remainder
          ROLC                # and rotate left 1 bit
          STA  [_AH_RES]      # Store it

# Now we want to subtract the 16-bit divisor from the
# most-significant two bytes of the result
          LDA  [_AH_RES+1]    # Load ACC with LS byte of remainder
          SUB  [_AH_DIV+1]    # Subtract LS byte of divisor
          STA  [_AH_RES+1]    # Store it in LS byte of remainder
          LDA  [_AH_RES]      # Load ACC with MS byte of remainder
          SUBC [_AH_DIV]      # Subtract MS byte of divisor (w carry)
          STA  [_AH_RES]      # Store it in MS byte of remainder

# If the carry flag is zero, set the LS bit of the result
# to logic 1 and jump to the end of the loop. Otherwise
# undo the harm we've just done by adding the 16-bit
# divisor back into the MS two bytes of the result
          JNC  [_AH_ADD]      # If carry flag not zero jump to
          LDA  [_AH_RES+3]    # _AH_ADD, otherwise load ACC with
          OR   $01            # LS byte of result, use OR to set LS
          STA  [_AH_RES+3]    # bit to 1, then store it and jump to
          JMP  [_AH_TSTL]     # _AH_TSTL (test at end of the loop)

_AH_ADD:  LDA  [_AH_RES+1]    # Load ACC with LS byte of remainder
          ADD  [_AH_DIV+1]    # Add LS byte of divisor (w/o carry)
          STA  [_AH_RES+1]    # Store it in LS byte of remainder
          LDA  [_AH_RES]      # Load ACC with MS byte of remainder
          ADDC [_AH_DIV]      # Add MS byte of divisor (with carry)
          STA  [_AH_RES]      # Store it in MS byte of remainder

_AH_TSTL: DECX                # Decrement the index register. If
          JNZ  [_AH_LOOP]     # the index register isn't 0 then jump
                              # back to the beginning of the loop

####
#### Breathe out - this is the end of the main division loop
#### Now check the flag and negate the quotient portion of the result
#### if necessary (see also the notes following the subroutine)
####
_AH_TSTC: LDA  [_AH_FLAG]     # Load ACC with the flag
          JNN  [_AH_SAVE]     # If MS bit of flag is 0 then
                              # jump to '_AH_SAVE', otherwise ..
          LDA  0              # ..load the accumulator with 0
          SUB  [_AH_RES+3]    # ..subtract LS byte of quotient
          STA  [_AH_RES+3]    # ..(no carry in) store result
          LDA  0              # ..load the accumulator with 0
          SUBC [_AH_RES+2]    # ..subtract MS byte of quotient
          STA  [_AH_RES+2]    # ..(yes carry in) store result

####
#### Save result on the stack and bug out of here. Remember that
#### we're only returning the 16-bit quotient portion of the result
####
_AH_SAVE: LDA  [_AH_RES+3]    # Load ACC with LS byte of quotient
          PUSHA               # and stick it on the stack
          LDA  [_AH_RES+2]    # Load ACC with MS byte of quotient
          PUSHA               # and stick it on the stack

_AH_RET:  LDA  [_AH_RADD+1]   # Load ACC with LS byte of return
          PUSHA               # address from temp location and
                              # stick it back on the stack
          LDA  [_AH_RADD]     # Load ACC with MS byte of return
          PUSHA               # address from temp location and
                              # stick it back on the stack
          RTS                 # That's it, exit the subroutine

_AH_FLAG: .BYTE               # Reserve 1-byte field to be used as
                              # flag to decide whether or not to
                              # negate the result
_AH_RADD: .2BYTE              # Reserve 2-byte temp location for
                              # the return address
_AH_DIV:  .2BYTE              # Reserve 2-byte temp location for
                              # the divisor
_AH_RES:  .4BYTE              # Reserve 4-byte temp location for
                              # the result. The MS two bytes of
                              # which will contain the remainder
                              # and the LS two bytes the quotient


